Buffer and Arithmetic Levels
----------------------------
Level 0
-------
=== mac ===

    SPECS
    -----
    Word bits             : 16
    Instances             : 98304 (768*128)
    Compute energy        : 1.41 pJ

    STATS
    -----
    Utilized instances      : 36864
    Computes (total)        : 37748736
    Cycles                  : 1024
    Energy (total)          : 53150220.29 pJ
    Area (total)            : 96888.42 um^2

Level 1
-------
=== scratchpad ===

    SPECS
    -----
        Technology                  : SRAM
        Size                        : 1
        Word bits                   : 16
        Block size                  : 1
        Cluster size                : 1
        Instances                   : 98304 (768*128)
        Shared bandwidth            : -
        Read bandwidth              : -
        Write bandwidth             : -
        Multiple buffering          : 1.00
        Effective size              : 1
        Min utilization             : 0.00
        Vector access energy        : 0.00 pJ
        Vector access energy source : ERT
        Area                        : 0.00 um^2

    MAPPING
    -------
    Loop nest:

    STATS
    -----
    Cycles               : 1024
    Bandwidth throttling : 1.00
    Weights:
        Partition size                           : 1
        Utilized capacity                        : 1
        Utilized instances (max)                 : 36864
        Utilized clusters (max)                  : 36864
        Scalar reads (per-instance)              : 1024
        Scalar updates (per-instance)            : 0
        Scalar fills (per-instance)              : 0
        Temporal reductions (per-instance)       : 0
        Address generations (per-cluster)        : 1024
        Energy (per-scalar-access)               : 0.00 pJ
        Energy (per-instance)                    : 0.00 pJ
        Energy (total)                           : 0.00 pJ
        Temporal Reduction Energy (per-instance) : 0.00 pJ
        Temporal Reduction Energy (total)        : 0.00 pJ
        Address Generation Energy (per-cluster)  : 0.00 pJ
        Address Generation Energy (total)        : 0.00 pJ
        Shared Bandwidth (per-instance)          : 1.00 words/cycle
        Shared Bandwidth (total)                 : 36864.00 words/cycle
        Read Bandwidth (per-instance)            : 1.00 words/cycle
        Read Bandwidth (total)                   : 36864.00 words/cycle
        Write Bandwidth (per-instance)           : 0.00 words/cycle
        Write Bandwidth (total)                  : 0.00 words/cycle

Level 2
-------
=== dummy_buffer ===

    SPECS
    -----
        Technology                  : SRAM
        Size                        : 0
        Word bits                   : 16
        Block size                  : 1
        Cluster size                : 1
        Instances                   : 6 (6*1)
        Shared bandwidth            : -
        Read bandwidth              : -
        Write bandwidth             : -
        Multiple buffering          : 1.00
        Effective size              : 0
        Min utilization             : 0.00
        Vector access energy        : 0.00 pJ
        Vector access energy source : ERT
        Area                        : 0.00 um^2

    MAPPING
    -------
    Loop nest:

    STATS
    -----
    Cycles               : 1024
    Bandwidth throttling : 1.00

Level 3
-------
=== input_buffer ===

    SPECS
    -----
        Technology                  : SRAM
        Size                        : 12288
        Word bits                   : 16
        Block size                  : 1
        Cluster size                : 1
        Instances                   : 6 (6*1)
        Shared bandwidth            : -
        Read bandwidth              : -
        Write bandwidth             : -
        Multiple buffering          : 1.00
        Effective size              : 12288
        Min utilization             : 0.00
        Vector access energy        : 4.97 pJ
        Vector access energy source : ERT
        Area                        : 41746.70 um^2

    MAPPING
    -------
    Loop nest:

    STATS
    -----
    Cycles               : 1024
    Bandwidth throttling : 1.00
    Inputs:
        Partition size                           : 12330
        Utilized capacity                        : 1088
        Utilized instances (max)                 : 6
        Utilized clusters (max)                  : 6
        Scalar reads (per-instance)              : 124932
        Scalar updates (per-instance)            : 0
        Scalar fills (per-instance)              : 34816
        Temporal reductions (per-instance)       : 0
        Address generations (per-cluster)        : 159748
        Energy (per-scalar-access)               : 4.53 pJ
        Energy (per-instance)                    : 724163.81 pJ
        Energy (total)                           : 4344982.89 pJ
        Temporal Reduction Energy (per-instance) : 0.00 pJ
        Temporal Reduction Energy (total)        : 0.00 pJ
        Address Generation Energy (per-cluster)  : 0.00 pJ
        Address Generation Energy (total)        : 0.00 pJ
        Shared Bandwidth (per-instance)          : 156.00 words/cycle
        Shared Bandwidth (total)                 : 936.02 words/cycle
        Read Bandwidth (per-instance)            : 122.00 words/cycle
        Read Bandwidth (total)                   : 732.02 words/cycle
        Write Bandwidth (per-instance)           : 34.00 words/cycle
        Write Bandwidth (total)                  : 204.00 words/cycle

Level 4
-------
=== shared_glb ===

    SPECS
    -----
        Technology                  : SRAM
        Size                        : 65536
        Word bits                   : 16
        Block size                  : 4
        Cluster size                : 1
        Instances                   : 1 (1*1)
        Shared bandwidth            : -
        Read bandwidth              : 16.00
        Write bandwidth             : 16.00
        Multiple buffering          : 1.00
        Effective size              : 65536
        Min utilization             : 0.00
        Vector access energy        : 41.71 pJ
        Vector access energy source : ERT
        Area                        : 590548.00 um^2

    MAPPING
    -------
    Loop nest:

    STATS
    -----
    Cycles               : 12816
    Bandwidth throttling : 0.08
    Inputs:
        Partition size                           : 73984
        Utilized capacity                        : 6528
        Utilized instances (max)                 : 1
        Utilized clusters (max)                  : 1
        Scalar reads (per-instance)              : 73984
        Scalar updates (per-instance)            : 0
        Scalar fills (per-instance)              : 73984
        Temporal reductions (per-instance)       : 0
        Address generations (per-cluster)        : 147968
        Energy (per-scalar-access)               : 10.36 pJ
        Energy (per-instance)                    : 1533518.16 pJ
        Energy (total)                           : 1533518.16 pJ
        Temporal Reduction Energy (per-instance) : 0.00 pJ
        Temporal Reduction Energy (total)        : 0.00 pJ
        Address Generation Energy (per-cluster)  : 0.00 pJ
        Address Generation Energy (total)        : 0.00 pJ
        Shared Bandwidth (per-instance)          : 11.55 words/cycle
        Shared Bandwidth (total)                 : 11.55 words/cycle
        Read Bandwidth (per-instance)            : 5.77 words/cycle
        Read Bandwidth (total)                   : 5.77 words/cycle
        Write Bandwidth (per-instance)           : 5.77 words/cycle
        Write Bandwidth (total)                  : 5.77 words/cycle
    Outputs:
        Partition size                           : 65536
        Utilized capacity                        : 2048
        Utilized instances (max)                 : 1
        Utilized clusters (max)                  : 1
        Scalar reads (per-instance)              : 0
        Scalar updates (per-instance)            : 65536
        Scalar fills (per-instance)              : 65536
        Temporal reductions (per-instance)       : 0
        Address generations (per-cluster)        : 131072
        Energy (per-scalar-access)               : 10.43 pJ
        Energy (per-instance)                    : 1366710.68 pJ
        Energy (total)                           : 1366710.68 pJ
        Temporal Reduction Energy (per-instance) : 0.00 pJ
        Temporal Reduction Energy (total)        : 0.00 pJ
        Address Generation Energy (per-cluster)  : 0.00 pJ
        Address Generation Energy (total)        : 0.00 pJ
        Shared Bandwidth (per-instance)          : 10.23 words/cycle
        Shared Bandwidth (total)                 : 10.23 words/cycle
        Read Bandwidth (per-instance)            : 0.00 words/cycle
        Read Bandwidth (total)                   : 0.00 words/cycle
        Write Bandwidth (per-instance)           : 10.23 words/cycle
        Write Bandwidth (total)                  : 10.23 words/cycle
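
Note on the shared_glb cycle count: the 12816 cycles reported at this level appear consistent with the write port being the bottleneck. The sketch below is a sanity check, not the tool's own model. It assumes that input fills, output updates, and output fills all share the 16 words/cycle write bandwidth listed under SPECS. The variable names are illustrative.

```python
# Values copied from the shared_glb SPECS and STATS above.
write_bandwidth = 16.0      # words/cycle ("Write bandwidth")
input_fills = 73984         # Inputs: scalar fills (per-instance)
output_updates = 65536      # Outputs: scalar updates (per-instance)
output_fills = 65536        # Outputs: scalar fills (per-instance)
compute_cycles = 1024       # "Cycles" reported by every other level

# Assumption: fills and updates are all served by the write port.
write_traffic = input_fills + output_updates + output_fills
glb_cycles = write_traffic / write_bandwidth

# Throttling read here as the ratio of unthrottled to throttled cycles.
throttling = compute_cycles / glb_cycles

print(glb_cycles)
print(round(throttling, 2))
```

Under that assumption the result matches both the reported cycle count and the reported bandwidth throttling of 0.08.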

Level 5
-------
=== DRAM ===

    SPECS
    -----
        Technology                  : DRAM
        Size                        : -
        Word bits                   : 16
        Block size                  : 4
        Cluster size                : 1
        Instances                   : 1 (1*1)
        Shared bandwidth            : -
        Read bandwidth              : -
        Write bandwidth             : -
        Multiple buffering          : 1.00
        Effective size              : -
        Min utilization             : 0.00
        Vector access energy        : 512.00 pJ
        Vector access energy source : ERT
        Area                        : 0.00 um^2

    MAPPING
    -------
    Loop nest:

    STATS
    -----
    Cycles               : 1024
    Bandwidth throttling : 1.00
    Inputs:
        Partition size                           : 73984
        Utilized capacity                        : 73984
        Utilized instances (max)                 : 1
        Utilized clusters (max)                  : 1
        Scalar reads (per-instance)              : 73984
        Scalar updates (per-instance)            : 0
        Scalar fills (per-instance)              : 0
        Temporal reductions (per-instance)       : 0
        Address generations (per-cluster)        : 73984
        Energy (per-scalar-access)               : 128.00 pJ
        Energy (per-instance)                    : 9469952.00 pJ
        Energy (total)                           : 9469952.00 pJ
        Temporal Reduction Energy (per-instance) : 0.00 pJ
        Temporal Reduction Energy (total)        : 0.00 pJ
        Address Generation Energy (per-cluster)  : 0.00 pJ
        Address Generation Energy (total)        : 0.00 pJ
        Shared Bandwidth (per-instance)          : 72.25 words/cycle
        Shared Bandwidth (total)                 : 72.25 words/cycle
        Read Bandwidth (per-instance)            : 72.25 words/cycle
        Read Bandwidth (total)                   : 72.25 words/cycle
        Write Bandwidth (per-instance)           : 0.00 words/cycle
        Write Bandwidth (total)                  : 0.00 words/cycle
    Outputs:
        Partition size                           : 65536
        Utilized capacity                        : 65536
        Utilized instances (max)                 : 1
        Utilized clusters (max)                  : 1
        Scalar reads (per-instance)              : 0
        Scalar updates (per-instance)            : 65536
        Scalar fills (per-instance)              : 0
        Temporal reductions (per-instance)       : 0
        Address generations (per-cluster)        : 65536
        Energy (per-scalar-access)               : 128.00 pJ
        Energy (per-instance)                    : 8388608.00 pJ
        Energy (total)                           : 8388608.00 pJ
        Temporal Reduction Energy (per-instance) : 0.00 pJ
        Temporal Reduction Energy (total)        : 0.00 pJ
        Address Generation Energy (per-cluster)  : 0.00 pJ
        Address Generation Energy (total)        : 0.00 pJ
        Shared Bandwidth (per-instance)          : 64.00 words/cycle
        Shared Bandwidth (total)                 : 64.00 words/cycle
        Read Bandwidth (per-instance)            : 0.00 words/cycle
        Read Bandwidth (total)                   : 0.00 words/cycle
        Write Bandwidth (per-instance)           : 64.00 words/cycle
        Write Bandwidth (total)                  : 64.00 words/cycle

Networks
--------
Network 0
---------
A2D_NoC

    SPECS
    -----
        Type            : SimpleMulticast
        ConnectionType  : 2
        Word bits       : 16
        Action Name     : transfer

    STATS
    -----
    Weights:
        Fanout                                  : 0
        Multicast factor                        : 0
        Ingresses                               : 0.00
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
    Inputs:
        Fanout                                  : 1
        Multicast factor                        : 1
        Ingresses                               : 102446.40
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
    Outputs:
        Fanout                                  : 1
        Multicast factor                        : 1
        Ingresses                               : 65536.00
        Energy (per-instance)                   : 21048024.10 pJ
        Energy (total)                          : 126288144.63 pJ

Network 1
---------
D2A_NoC

    SPECS
    -----
        Type            : SimpleMulticast
        ConnectionType  : 1
        Word bits       : 16
        Action Name     : transfer

    STATS
    -----
    Weights:
        Fanout                                  : 0
        Multicast factor                        : 0
        Ingresses                               : 0.00
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
    Inputs:
        Fanout                                  : 1
        Multicast factor                        : 1
        Ingresses                               : 102446.40
        Energy (per-instance)                   : 156833.14 pJ
        Energy (total)                          : 940998.85 pJ
    Outputs:
        Fanout                                  : 1
        Multicast factor                        : 1
        Ingresses                               : 65536.00
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ

Network 2
---------
DRAM <==> shared_glb

    SPECS
    -----
        Type            : Legacy
        Legacy sub-type : 
        ConnectionType  : 3
        Word bits       : 16
        Router energy   : - pJ
        Wire energy     : - pJ/b/mm
        Fill latency    : 0
        Drain latency   : 0

    STATS
    -----
    Weights:
        Fanout                                  : 0
        Fanout (distributed)                    : 0
        Multicast factor                        : 0
        Ingresses                               : 0.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.00
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Inputs:
        Fanout                                  : 1
        Fanout (distributed)                    : 0
        Multicast factor                        : 1
        Ingresses                               : 73984.00
            @multicast 1 @scatter 1: 73984.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.50
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Outputs:
        Fanout                                  : 1
        Fanout (distributed)                    : 0
        Multicast factor                        : 1
        Ingresses                               : 65536.00
            @multicast 1 @scatter 1: 65536.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.50
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ

Network 3
---------
dummy_buffer <==> scratchpad

    SPECS
    -----
        Type            : Legacy
        Legacy sub-type : 
        ConnectionType  : 3
        Word bits       : 16
        Router energy   : - pJ
        Wire energy     : - pJ/b/mm
        Fill latency    : 0
        Drain latency   : 0

    STATS
    -----
    Weights:
        Fanout                                  : 0
        Fanout (distributed)                    : 0
        Multicast factor                        : 0
        Ingresses                               : 0.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.00
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Inputs:
        Fanout                                  : 6144
        Fanout (distributed)                    : 0
        Multicast factor                        : 64
        Ingresses                               : 102446.40
            @multicast 16 @scatter 2: 5523.20
            @multicast 64 @scatter 32: 88371.15
            @multicast 64 @scatter 96: 8552.05
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 635.64
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Outputs:
        Fanout                                  : 6144
        Fanout (distributed)                    : 0
        Multicast factor                        : 96
        Ingresses                               : 65536.00
            @multicast 96 @scatter 64: 65536.00
        Link transfers                          : 0
        Spatial reductions                      : 6225920
        Average number of hops                  : 794.79
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ

Network 4
---------
scratchpad <==> mac

    SPECS
    -----
        Type            : Legacy
        Legacy sub-type : 
        ConnectionType  : 3
        Word bits       : 16
        Router energy   : - pJ
        Wire energy     : - pJ/b/mm
        Fill latency    : 0
        Drain latency   : 0

    STATS
    -----
    Weights:
        Fanout                                  : 1
        Fanout (distributed)                    : 0
        Multicast factor                        : 1
        Ingresses                               : 1024.00
            @multicast 1 @scatter 1: 1024.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.50
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Inputs:
        Fanout                                  : 1
        Fanout (distributed)                    : 0
        Multicast factor                        : 1
        Ingresses                               : 1024.00
            @multicast 1 @scatter 1: 1024.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.50
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Outputs:
        Fanout                                  : 1
        Fanout (distributed)                    : 0
        Multicast factor                        : 1
        Ingresses                               : 1024.00
            @multicast 1 @scatter 1: 1024.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.50
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ

Network 5
---------
shared_glb <==> input_buffer

    SPECS
    -----
        Type            : Legacy
        Legacy sub-type : 
        ConnectionType  : 3
        Word bits       : 16
        Router energy   : - pJ
        Wire energy     : - pJ/b/mm
        Fill latency    : 0
        Drain latency   : 0

    STATS
    -----
    Weights:
        Fanout                                  : 0
        Fanout (distributed)                    : 0
        Multicast factor                        : 0
        Ingresses                               : 0.00
        Link transfers                          : 0
        Spatial reductions                      : 0
        Average number of hops                  : 0.00
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Inputs:
        Fanout                                  : 6
        Fanout (distributed)                    : 0
        Multicast factor                        : 1
        Ingresses                               : 73984.00
            @multicast 1 @scatter 2: 67456.00
            @multicast 1 @scatter 6: 6528.00
        Link transfers                          : 134912
        Spatial reductions                      : 0
        Average number of hops                  : 1.22
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ
    Outputs:
        Fanout                                  : 6
        Fanout (distributed)                    : 0
        Multicast factor                        : 6
        Ingresses                               : 65536.00
            @multicast 6 @scatter 1: 65536.00
        Link transfers                          : 0
        Spatial reductions                      : 327680
        Average number of hops                  : 5.50
        Energy (per-hop)                        : 0.00 fJ
        Energy (per-instance)                   : 0.00 pJ
        Energy (total)                          : 0.00 pJ
        Link transfer energy (per-instance)     : 0.00 pJ
        Link transfer energy (total)            : 0.00 pJ
        Spatial Reduction Energy (per-instance) : 0.00 pJ
        Spatial Reduction Energy (total)        : 0.00 pJ


Operational Intensity Stats
---------------------------
    Total elementwise ops                   : 37748736
    Total reduction ops                     : 37683200
    Total ops                               : 75431936
    Total memory accesses required          : 139521
    Optimal Op per Byte                     : 270.32

=== scratchpad ===
    Total scalar accesses                   : 37748736
    Op per Byte                             : 1.00
=== dummy_buffer ===
=== input_buffer ===
    Total scalar accesses                   : 958488
    Op per Byte                             : 39.35
=== shared_glb ===
    Total scalar accesses                   : 279040
    Op per Byte                             : 135.16
=== DRAM ===
    Total scalar accesses                   : 139520
    Op per Byte                             : 270.33
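
Note on Op per Byte: the per-level figures above can be reproduced by dividing the total op count by the bytes moved at that level. The sketch below assumes 2 bytes per scalar access, taken from the 16-bit word size listed for every level. The helper name op_per_byte is illustrative and not part of the tool.

```python
WORD_BYTES = 16 // 8                   # "Word bits : 16" at every level

elementwise_ops = 37748736
reduction_ops = 37683200
total_ops = elementwise_ops + reduction_ops

def op_per_byte(scalar_accesses):
    """Ops performed per byte accessed at one storage level."""
    return total_ops / (scalar_accesses * WORD_BYTES)

# Total scalar accesses per level, copied from the report above.
accesses = {
    "scratchpad": 37748736,
    "input_buffer": 958488,
    "shared_glb": 279040,
    "DRAM": 139520,
}

intensity = {name: round(op_per_byte(n), 2) for name, n in accesses.items()}
optimal = round(op_per_byte(139521), 2)   # "Total memory accesses required"

for name, value in intensity.items():
    print(f"{name:12s} {value:.2f}")
print(f"{'optimal':12s} {optimal:.2f}")
```

The DRAM figure sits within rounding of the optimal value because its access count (139520) is almost equal to the minimum required (139521).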


Summary Stats
-------------
GFLOPs (@1GHz): 5885.76
Utilization: 0.03
Cycles: 12816
Energy: 205.48 uJ
EDP(J*cycle): 2.63e+00
Area: 0.00 mm^2
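
Note on the derived summary figures: GFLOPs, Utilization, and EDP can be cross-checked against numbers reported elsewhere in this file. The formulas below are inferred from the reported values and are not taken from the tool's source, so treat them as a sanity check.

```python
# Inputs copied from the report.
total_ops = 75431936        # "Total ops" (elementwise + reduction)
computes = 37748736         # "Computes (total)" at the mac level
mac_instances = 98304       # mac "Instances" (768*128)
cycles = 12816              # Summary "Cycles"
energy_uj = 205.48          # Summary "Energy" in microjoules

# At 1 GHz, ops per cycle equals giga-ops per second.
gflops = total_ops / cycles

# Fraction of MAC-cycles that perform a compute.
utilization = computes / (mac_instances * cycles)

# Energy-delay product in joule-cycles.
edp = energy_uj * 1e-6 * cycles

print(f"{gflops:.2f}")
print(f"{utilization:.2f}")
print(f"{edp:.2e}")
```

The low utilization combines two effects visible above: only 36864 of the 98304 MAC instances are used, and shared_glb throttling stretches 1024 compute cycles to 12816.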

Computes = 37748736
pJ/Compute
    mac                          = 1.41
    scratchpad                   = 0.00
    dummy_buffer                 = 0.00
    input_buffer                 = 0.12
    shared_glb                   = 0.08
    DRAM                         = 0.47
    A2D_NoC                      = 3.35
    D2A_NoC                      = 0.02
    DRAM <==> shared_glb         = 0.00
    dummy_buffer <==> scratchpad = 0.00
    scratchpad <==> mac          = 0.00
    shared_glb <==> input_buffer = 0.00
    Total                        = 5.44
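
Note on the pJ/Compute breakdown: each entry is that component's total energy divided by the number of computes. The sketch below sums the "Energy (total)" lines from the level and network sections above. Components that report 0.00 pJ are omitted, and the dictionary keys are just labels.

```python
computes = 37748736

# "Energy (total)" values in pJ, copied from the sections above.
energy_pj = {
    "mac": 53150220.29,
    "input_buffer": 4344982.89,
    "shared_glb": 1533518.16 + 1366710.68,   # Inputs + Outputs
    "DRAM": 9469952.00 + 8388608.00,         # Inputs + Outputs
    "A2D_NoC": 126288144.63,
    "D2A_NoC": 940998.85,
}

per_compute = {name: round(e / computes, 2) for name, e in energy_pj.items()}
total_pj = sum(energy_pj.values())
total_per_compute = round(total_pj / computes, 2)
total_uj = round(total_pj / 1e6, 2)

for name, value in per_compute.items():
    print(f"{name:14s} = {value:.2f}")
print(f"{'Total':14s} = {total_per_compute:.2f}  ({total_uj:.2f} uJ)")
```

A2D_NoC accounts for roughly 61% of the total energy (3.35 of 5.44 pJ per compute), more than the MACs and all storage levels combined.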

